Finite-sum Composition Optimization via Variance Reduced Gradient Descent
Authors
Abstract
The stochastic composition optimization problem recently proposed by Wang et al. [2014] minimizes an objective in the compositional expectation form $\min_x \, (\mathbb{E}_i F_i \circ \mathbb{E}_j G_j)(x)$. It covers many important applications in machine learning, statistics, and finance. In this paper, we consider the finite-sum scenario for composition optimization: $\min_x f(x) := \frac{1}{n} \sum_{i=1}^{n} F_i\left( \frac{1}{m} \sum_{j=1}^{m} G_j(x) \right)$. We propose two algorithms to solve this problem by combining stochastic compositional gradient descent (SCGD) with the stochastic variance reduced gradient (SVRG) technique. A constant linear convergence rate is proved for strongly convex optimization, which substantially improves on the sublinear rate $O(K^{-0.8})$ of the best previously known algorithm.
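To make the setting concrete, below is a minimal, self-contained sketch (not the authors' exact algorithm) of an SVRG-style compositional method on a toy instance with linear inner maps $G_j(x) = A_j x + b_j$ and quadratic outer losses $F_i(y) = \frac{1}{2}\|y - c_i\|^2$. The problem sizes, step size, and epoch length are illustrative assumptions. Each epoch takes a full "snapshot" of the inner value and the gradient, and the inner loop uses variance-reduced estimates of both the inner value and the compositional gradient.

```python
# A toy sketch of an SVRG-style compositional gradient method.
# Assumed instance: G_j(x) = A_j x + b_j, F_i(y) = 0.5 * ||y - c_i||^2,
# so f(x) = (1/n) sum_i F_i((1/m) sum_j G_j(x)). Sizes and step size are made up.
import numpy as np

rng = np.random.default_rng(0)
d, p, n, m = 4, 6, 8, 5                       # illustrative dimensions and counts
A = rng.standard_normal((m, p, d))            # Jacobians of the linear inner maps G_j
b = rng.standard_normal((m, p))
c = rng.standard_normal((n, p))               # targets defining the outer losses F_i
A_bar, b_bar = A.mean(axis=0), b.mean(axis=0)

def G(x):                                     # inner mean map (1/m) sum_j G_j(x)
    return A_bar @ x + b_bar

def grad_F(i, y):                             # gradient of F_i at y
    return y - c[i]

def full_grad(x):                             # exact gradient of f at x (snapshot pass)
    y = G(x)
    return A_bar.T @ np.mean([grad_F(i, y) for i in range(n)], axis=0)

def objective(x):
    return 0.5 * np.mean([np.sum((G(x) - c[i]) ** 2) for i in range(n)])

x, eta, epochs, inner = np.zeros(d), 0.02, 100, 2 * n
for _ in range(epochs):
    x_tilde = x.copy()
    y_tilde, g_tilde = G(x_tilde), full_grad(x_tilde)        # full snapshot, as in SVRG
    for _ in range(inner):
        i, j = rng.integers(n), rng.integers(m)
        # variance-reduced estimate of the inner value (1/m) sum_j G_j(x)
        y_hat = (A[j] @ x + b[j]) - (A[j] @ x_tilde + b[j]) + y_tilde
        # variance-reduced compositional gradient estimate
        v = A[j].T @ grad_F(i, y_hat) - A[j].T @ grad_F(i, y_tilde) + g_tilde
        x = x - eta * v

print("final objective:", objective(x))
print("final gradient norm:", np.linalg.norm(full_grad(x)))
```

Because the inner maps are linear in this toy, the per-sample Jacobians coincide with the snapshot Jacobians, which keeps the estimator in the familiar SVRG control-variate form.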
Similar papers
Improved Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization
We consider the nonsmooth convex composition optimization problem where the objective is a composition of two finite-sum functions and analyze stochastic compositional variance reduced gradient (SCVRG) methods for them. SCVRG and its variants have recently drawn much attention given their edge over stochastic compositional gradient descent (SCGD); but the theoretical analysis ...
Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure
Stochastic optimization algorithms with variance reduction have proven successful for minimizing large finite sums of functions. However, in the context of empirical risk minimization, it is often helpful to augment the training set by considering random perturbations of input examples. In this case, the objective is no longer a finite sum, and the main candidate for optimization is the stochas...
Stochastic Variance Reduction for Nonconvex Optimization
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary...
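For reference, the basic SVRG control-variate estimator that such analyses build on can be sketched in a few lines. The least-squares objective below is only a stand-in (it is convex, unlike the nonconvex components studied in the cited work), and the sizes and step size are illustrative assumptions.

```python
# A generic SVRG sketch on a least-squares finite sum (stand-in objective).
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 10
A = rng.standard_normal((n, d))
y = rng.standard_normal(n)

def grad_i(w, i):                       # gradient of the i-th component 0.5*(A_i w - y_i)^2
    return (A[i] @ w - y[i]) * A[i]

def full_grad(w):                       # full gradient, computed at each snapshot
    return A.T @ (A @ w - y) / n

w, eta = np.zeros(d), 0.01
for _ in range(30):                     # outer epochs
    w_tilde, g_tilde = w.copy(), full_grad(w)
    for _ in range(2 * n):              # inner SVRG iterations
        i = rng.integers(n)
        v = grad_i(w, i) - grad_i(w_tilde, i) + g_tilde   # control-variate estimator
        w = w - eta * v

w_star = np.linalg.lstsq(A, y, rcond=None)[0]
print("distance to least-squares solution:", np.linalg.norm(w - w_star))
```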
Parallel Asynchronous Stochastic Variance Reduction for Nonconvex Optimization
Nowadays, asynchronous parallel algorithms have received much attention in the optimization field due to the crucial demands of modern large-scale optimization problems. However, most asynchronous algorithms focus on convex problems; analysis of nonconvex problems is lacking. For the Asynchronous Stochastic Gradient Descent (ASGD) algorithm, the best result from (Lian et al., 2015) can only achieve an ...
Variance-Reduced Proximal Stochastic Gradient Descent for Non-convex Composite Optimization
Here we study non-convex composite optimization, where the objective combines a finite sum of smooth but non-convex functions with a general function that admits a simple proximal mapping. Most research on stochastic methods for composite optimization assumes convexity or strong convexity of each function. In this paper, we extend this problem to the non-convex setting using variance reduction techniques, ...
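As an illustration of the proximal ingredient, here is a minimal sketch of one proximal step applied to a variance-reduced gradient estimate, using h(x) = lam * ||x||_1 (soft-thresholding) as an assumed choice of the simple proximal term; the vectors and parameters are made up for the example.

```python
# One proximal step with a (variance-reduced) gradient estimate v:
# x <- prox_{eta*h}(x - eta * v), with h(x) = lam * ||x||_1 as an assumed example,
# whose proximal operator is entrywise soft-thresholding.
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_vr_step(x, v, eta, lam):
    """One proximal-gradient step using gradient estimate v."""
    return soft_threshold(x - eta * v, eta * lam)

x = np.array([0.8, -0.3, 0.05, 1.2])
v = np.array([0.5, -0.1, 0.2, -0.4])   # stands in for an SVRG-style gradient estimate
print(prox_vr_step(x, v, eta=0.1, lam=0.5))
```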
Publication year: 2017